Diffusion Approximations for the Constant Step Size Backpropagation Algorithm and Resistance to Local Minima

Author

  • William Finnoff
Abstract

In this paper we discuss the asymptotic properties of the most commonly used variant of the backpropagation algorithm, in which network weights are trained by means of a local gradient descent on examples drawn randomly from a fixed training set, and the learning rate η of the gradient updates is held constant (simple backpropagation). Using stochastic approximation results, we show that for η → 0 this training process approaches a batch training process, and we provide results on the rate of convergence. Further, we show that for small η one can approximate simple backpropagation by the sum of a batch training process and a Gaussian diffusion, which is the unique solution to a linear stochastic differential equation. Using this approximation, we indicate the reasons why simple backpropagation is less likely to get stuck in local minima than the batch training process, and we demonstrate this empirically on a number of examples.
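To make the two processes concrete, the sketch below contrasts simple backpropagation (one randomly drawn example per update, constant learning rate η) with the corresponding batch training process on a toy one-parameter tanh model. The model, data, and function names are illustrative assumptions, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed training set for a toy one-parameter model y = tanh(w * x).
X = rng.uniform(-2.0, 2.0, size=100)
Y = np.tanh(1.5 * X) + 0.1 * rng.standard_normal(100)

def grad(w, x, y):
    """Gradient of the per-example squared error 0.5 * (tanh(w*x) - y)**2."""
    a = np.tanh(w * x)
    return (a - y) * (1.0 - a ** 2) * x

def simple_backprop(w, eta, steps):
    """One randomly drawn example per update, constant learning rate eta."""
    for _ in range(steps):
        i = rng.integers(len(X))
        w -= eta * grad(w, X[i], Y[i])
    return w

def batch_training(w, eta, steps):
    """Deterministic descent on the gradient averaged over the whole set."""
    for _ in range(steps):
        w -= eta * grad(w, X, Y).mean()
    return w

print(simple_backprop(0.1, eta=0.05, steps=4000))
print(batch_training(0.1, eta=0.05, steps=4000))
```

As η shrinks (with the number of updates scaled like 1/η), the stochastic trajectory tracks the batch trajectory, and its fluctuations around it are of order √η; these rescaled fluctuations are the Gaussian diffusion referred to in the abstract, and their ability to kick the weights out of shallow basins is the stated reason for the greater resistance of simple backpropagation to local minima.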


Similar articles

On Mappable Nonlinearities in Robustness Analysis - IEEE Transactions on Automatic Control

K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, pp. 359-366, 1989. A. Pinkus, n-Widths in Approximation Theory. New York: Springer-Verlag, 1986. F. Girosi, “Regularization theory, radial basis functions and networks,” From Statistics to Neural Networks: Theory and Pattern Recognition Applications, V. Cherkassky, J. H....

Full text

Learning Algorithms for Connectionist Networks: Applied Gradient Methods of Nonlinear Optimization

The problem of learning using connectionist networks, in which network connection strengths are modified systematically so that the response of the network increasingly approximates the desired response, can be structured as an optimization problem. The widely used backpropagation method of connectionist learning [19, 21, 18] is set in the context of nonlinear optimization. In this framework, t...

Full text
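Concretely, the framing this article describes amounts to collecting all connection strengths into one parameter vector and handing the training error and its gradient to a generic nonlinear optimizer. The sketch below does this for a linear stand-in for the network; the toy model and the use of scipy are assumptions for illustration, not the article's method.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 2))
Y = np.sin(X[:, 0]) - 0.5 * X[:, 1]

def error(w):
    """Scalar training error: half the summed squared residuals."""
    r = X @ w - Y
    return 0.5 * r @ r

def error_grad(w):
    """Gradient of the training error with respect to the weights."""
    return X.T @ (X @ w - Y)

# Any gradient-based nonlinear optimizer now applies; BFGS shown here.
res = minimize(error, x0=np.zeros(2), jac=error_grad, method="BFGS")
print(res.x, res.fun)
```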

Monotone corrections for generic cell-centered finite volume approximations of anisotropic diffusion equations

We present a nonlinear technique to correct a general Finite Volume scheme for anisotropic diffusion problems so that it provides a discrete maximum principle. We point out general properties satisfied by many Finite Volume schemes and prove that the proposed corrections also preserve these properties. We then study two specific corrections, proving, under numerical assumptions, that the corresponding ap...

Full text
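For context, the sketch below is a plain 1-D cell-centered finite volume discretization of -(k u')' = f with two-point fluxes. In one dimension this already yields an M-matrix and a discrete maximum principle; the nonlinear corrections the article studies address the anisotropic multi-dimensional cases where this fails. The code is a generic illustration, not the article's correction.

```python
import numpy as np

def fv_diffusion_1d(k, f, u_left, u_right, n):
    """Cell-centered finite volumes for -(k(x) u'(x))' = f(x) on (0, 1),
    with Dirichlet values u_left, u_right and n cells."""
    h = 1.0 / n
    xc = (np.arange(n) + 0.5) * h                      # cell centers
    kc = k(xc)
    kf = 2.0 * kc[:-1] * kc[1:] / (kc[:-1] + kc[1:])   # harmonic mean at faces
    A = np.zeros((n, n))
    b = f(xc) * h
    for i in range(n - 1):                             # face between cells i, i+1
        t = kf[i] / h                                  # two-point transmissibility
        A[i, i] += t
        A[i, i + 1] -= t
        A[i + 1, i + 1] += t
        A[i + 1, i] -= t
    tl = k(0.0) / (h / 2.0)                            # half-cell boundary fluxes
    tr = k(1.0) / (h / 2.0)
    A[0, 0] += tl
    b[0] += tl * u_left
    A[-1, -1] += tr
    b[-1] += tr * u_right
    return xc, np.linalg.solve(A, b)

# Usage: discontinuous coefficient, unit source, boundary values 1 and 0.
x, u = fv_diffusion_1d(lambda x: 1.0 + 9.0 * (np.asarray(x) > 0.5),
                       lambda x: np.ones_like(x), 1.0, 0.0, 50)
```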

Training neural networks with additive noise in the desired signal

A new global optimization strategy for training adaptive systems such as neural networks and adaptive filters [finite or infinite impulse response (FIR or IIR)] is proposed in this paper. Instead of adding random noise to the weights as proposed in the past, additive random noise is injected directly into the desired signal. Experimental results show that this procedure also speeds up greatly t...

Full text
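A minimal sketch of the stated idea: inject annealed additive noise into the desired signal of an LMS-trained linear filter rather than into the weights. The filter, annealing schedule, and names are illustrative assumptions, not the article's procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

def train_with_target_noise(X, d, eta=0.01, epochs=200, sigma0=0.5):
    """LMS training of a linear (FIR-style) filter, with annealed additive
    noise injected into the desired signal d instead of into the weights."""
    w = np.zeros(X.shape[1])
    for epoch in range(epochs):
        sigma = sigma0 * (1.0 - epoch / epochs)    # anneal noise to zero
        d_noisy = d + sigma * rng.standard_normal(len(d))
        for x, target in zip(X, d_noisy):
            e = target - x @ w                     # instantaneous error
            w += eta * e * x                       # LMS weight update
    return w

# Toy usage: identify a 3-tap filter from input/output pairs.
X = rng.standard_normal((500, 3))
d = X @ np.array([0.5, -0.3, 0.8])
print(train_with_target_noise(X, d))
```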

Kinetics of chemical leaching of chalcocite from low-grade copper ore: size-distribution behavior

The kinetics of the chemical leaching of chalcocite from a low-grade copper ore in a ferric sulfate medium were investigated using a constrained least-squares optimization technique. The experiments were carried out for different particle sizes in both the reactor and the column at constant Eh, pH, and temperature. The leaching rate increased with increasing temperature. About 50% of the Cu recove...

Full text
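As a generic illustration of fitting leaching kinetics by constrained least squares, the sketch below fits a surface-reaction-controlled shrinking-core expression to conversion-time data with a non-negativity bound on the rate constant. The model choice, the data, and the scipy usage are assumptions for illustration, not the authors' procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative conversion-vs-time data (fraction of Cu leached).
t = np.array([0.0, 2.0, 4.0, 8.0, 16.0, 24.0])      # hours
X = np.array([0.0, 0.12, 0.21, 0.33, 0.45, 0.50])   # conversion

def shrinking_core(t, k):
    """Surface-reaction-controlled shrinking core: 1 - (1 - X)**(1/3) = k*t,
    solved for X (clipped so the model stays valid for large t)."""
    return 1.0 - (1.0 - np.clip(k * t, 0.0, 1.0)) ** 3

# Constrained fit: the rate constant must be non-negative.
(k_fit,), _ = curve_fit(shrinking_core, t, X, p0=[0.01], bounds=(0.0, np.inf))
print(f"k = {k_fit:.4f} 1/h")
```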




Publication date: 1992